Deterministically annealed mixture of experts models for statistical regression

Authors

  • Ajit V. Rao
  • David J. Miller
  • Kenneth Rose
  • Allen Gersho
Abstract

A new and effective design method is presented for statistical regression functions that belong to the class of mixture models. The class includes the hierarchical mixture of experts (HME) and the normalized radial basis functions (NRBF). Design algorithms based on the maximum likelihood (ML) approach, which emphasize a probabilistic description of the model, have attracted much interest for HME and NRBF models. However, their design objective is mismatched to the original squared-error regression cost, and the algorithms are easily trapped by poor local minima on the cost surface. In this paper, we propose an extension of the deterministic annealing (DA) method for the design of mixture-based regression models. We construct a probabilistic framework, but unlike the ML method, we directly optimize the squared-error regression cost while avoiding poor local minima. Experimental results show that the DA method outperforms standard design methods for both HME and NRBF regression models.

1. MIXTURE OF EXPERTS REGRESSION

In recent years, there has been growing interest in learning methods for regression functions that can be statistically interpreted as mixture models or mixture of experts (ME) models. The ME regression function takes the form:

g(x) = \sum_j P[j|x] f(x; \theta_j),   (1)

where P[j|x] is a non-negative weight of association between the input x and the jth "local expert regression function", f(x; \theta_j). Each local expert, f(x; \theta_j), is usually a constant, linear, or simple nonlinear function of x and depends on the parameter set \theta_j. The weights of association can be naturally interpreted as a probability distribution, since \sum_j P[j|x] = 1.

This work was supported in part by the National Science Foundation under grant no. NCR-9314335, the University of California MICRO program, ACT Networks, Advanced Computer Communications, Stratacom, DSP Group, DSP Software Engineering, Fujitsu, General Electric Company, Hughes Electronics, Intel, Moseley Associates, National Semiconductor, Nokia Mobile Phones, Qualcomm, Rockwell International, and Texas Instruments. David Miller was supported by NSF Career Award NSF IRI-9624870.
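Equation (1) can be sketched in code. The following is a minimal illustration, assuming a softmax gating network and linear experts; the parameter matrices V and W and the helper name `me_predict` are illustrative choices, not taken from the paper:

```python
import numpy as np

def me_predict(x, V, W):
    """Evaluate the ME regression function g(x) = sum_j P[j|x] f(x; theta_j).

    Assumed parameterizations (for illustration only): gating weights
    P[j|x] come from a softmax over v_j . x, and each local expert is
    linear, f(x; theta_j) = w_j . x.  V and W hold one expert per row.
    """
    logits = V @ x
    logits -= logits.max()      # subtract max for numerical stability
    p = np.exp(logits)
    p /= p.sum()                # gating weights: non-negative, sum to 1
    experts = W @ x             # each expert's prediction f(x; theta_j)
    return p @ experts          # convex combination of expert outputs
```

Because the gating weights form a probability distribution, g(x) is always a convex combination of the individual expert outputs.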

Similar resources

An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models

Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...


The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models

In previous studies on fitting non-linear regression models with the symmetric structure the normality is usually assumed in the analysis of data. This choice may be inappropriate when the distribution of residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions is the main concern of many researchers. This family includes several skewed and heavy-tailed d...


Comparison of Ensemble Approaches: Mixture of Experts and AdaBoost for a Regression Problem

Two machine learning approaches: mixture of experts and AdaBoost.R2 were adjusted to the real-world regression problem of predicting the prices of residential premises based on historical data of sales/purchase transactions. The computationally intensive experiments were conducted aimed to compare empirically the prediction accuracy of ensemble models generated by the methods. The analysis of t...


Mixture of experts regression modeling by deterministic annealing

We propose a new learning algorithm for regression modeling. The method is especially suitable for optimizing neural network structures that are amenable to a statistical description as mixture models. These include mixture of experts, hierarchical mixture of experts (HME), and normalized radial basis functions (NRBF). Unlike recent maximum likelihood (ML) approaches, we directly minimize the (...


Using Regression based Control Limits and Probability Mixture Models for Monitoring Customer Behavior

In order to achieve the maximum flexibility in adaptation to ever changing customer’s expectations in customer relationship management, appropriate measures of customer behavior should be continually monitored. To this end, control charts adjusted for buyer’s/visitor’s prior intention to repurchase or visit again are suitable means taking into account the heterogeneity across customers. In the ...



Journal:

Volume   Issue

Pages  -

Publication date: 1997